test(kad): use virtual time to resolve flaky bootstrap test#6361
0xsamalt wants to merge 7 commits into libp2p:master
Conversation
This pull request has merge conflicts. Could you please resolve them @0xsamalt? 🙏
Thank you for this PR @0xsamalt! Deterministic timings in the test scenario would be great. Still, I don't think we should switch the […] I haven't seen the flaky kad test in the wild for a while; did you recently run into issues with it?
Hey @elenaf9, that completely makes sense! I agree that testing the exact production dependencies is generally much safer. To answer your question: yes, I did recently run into this flakiness locally. My CPU had a tiny lag spike during the test run, so a timer that was supposed to hit the assertion at […] I attempted to introduce Tokio's […] Since the flakiness only seems to pop up on heavily loaded or laggy local machines, how would you like to proceed?
Let me know what you prefer!
Description
This PR fixes intermittent flakiness in Kademlia's bootstrap.rs test suite that is triggered by CPU lag.
The tests previously measured real-world timeouts using futures_timer::Delay and web_time::Instant, causing them to fail whenever a CI server overshot the expected timing.
By leveraging Tokio's test-util feature and explicitly wrapping our mocks for the tests, #[tokio::test(start_paused = true)] makes the elapsed test time deterministic and decouples it from wall-clock performance. The tests run instantly and passed 100 out of 100 runs when looped locally.
cc @jxs @elenaf9
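To illustrate why virtual time removes the flakiness, here is a minimal std-only sketch of the idea. It is not the code in this PR and not how Tokio implements its paused clock: `Clock`, `VirtualClock`, and `PeriodicTrigger` are hypothetical names invented for this example. The point is only that a timer driven by a clock the test advances by hand fires at exactly the instant the test chooses, regardless of how slow the machine is.

```rust
use std::cell::Cell;
use std::time::Duration;

/// Hypothetical clock abstraction (an assumption for this sketch,
/// not a libp2p or Tokio type).
pub trait Clock {
    /// Time elapsed since the clock was created.
    fn now(&self) -> Duration;
}

/// Virtual clock: time moves only when the test calls `advance`.
pub struct VirtualClock {
    now: Cell<Duration>,
}

impl VirtualClock {
    pub fn new() -> Self {
        Self { now: Cell::new(Duration::ZERO) }
    }

    /// Move virtual time forward by `by`, independent of wall-clock time.
    pub fn advance(&self, by: Duration) {
        self.now.set(self.now.get() + by);
    }
}

impl Clock for VirtualClock {
    fn now(&self) -> Duration {
        self.now.get()
    }
}

/// Hypothetical periodic trigger, loosely modelled on a bootstrap interval.
pub struct PeriodicTrigger {
    interval: Duration,
    next_deadline: Duration,
}

impl PeriodicTrigger {
    pub fn new(clock: &impl Clock, interval: Duration) -> Self {
        Self { interval, next_deadline: clock.now() + interval }
    }

    /// Returns `true` when the deadline has been reached, then re-arms
    /// the trigger one interval after the current time.
    pub fn poll(&mut self, clock: &impl Clock) -> bool {
        if clock.now() >= self.next_deadline {
            self.next_deadline = clock.now() + self.interval;
            true
        } else {
            false
        }
    }
}
```

In a test, advancing the clock by 9 ms and then by 1 ms against a 10 ms interval makes the trigger fire on the second poll every time. With `start_paused = true`, Tokio's paused clock plays a role comparable to `VirtualClock` here: time advances only when the runtime has nothing else to do or when the test advances it explicitly.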
Notes & open questions
futures_timer::Delay is preserved under #[cfg(not(test))].
Change checklist